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Abstract 

This paper discusses how to find the global minimum of functions that are summations of 
small polynomials ("small" means involving a small number of variables). Some sparse sum of 
squares (SOS) techniques are proposed. We compare their computational complexity and lower 
bounds with prior SOS relaxations. Under certain conditions, we also discuss how to extract the 
global minimizers from these sparse relaxations. The proposed methods are especially useful in 
solving sparse polynomial system and nonlinear least squares problems. Numerical experiments 
are presented, which show that the proposed methods significantly improve the computational 
performance of prior methods for solving these problems. Lastly, we present applications of this 
sparsity technique in solving polynomial systems derived from nonlinear differential equations and 
sensor network localization. 
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1 Introduction 

Global optimization of multivariate polynomial functions contains quite a broad class of optimiza- 
tion problems. It has wide and important applications in science and engineering. Recently, there 
has been much work on globally minimizing polynomial functions using representation theorems 
from real algebraic geometry for positive polynomials. The basic idea is to approximate nonneg- 
ative polynomials by sum of squares (SOS) polynomials. This approximation is also called SOS 
relaxation, since not every nonnegative polynomial is SOS. Here a polynomial is said to be SOS 
if it can be written as a sum of squares of other polynomials. The advantage of SOS polynomials 
is that a polynomial is SOS if and only if a certain semidefinite program (SDP) formed by its 
coefficients is feasible. Since SDP "36 has efficient numerical methods, we can check whether a 
polynomial is SOS by solving a particular SDP. 

To be more specific, suppose we wish to find the global minimum value /* of a polynomial 
function f{x) of vector x = {xi,X2, ■ ■ ■ , Xn) € M". The SOS relaxation finds a lower bound 7 for 
/* such that the polynomial f{x) — 7 is SOS. Obviously, f{x) — 7 being SOS implies that f{x) — 7 
is nonnegative for every real vector x. Hence such a 7 is a lower bound. The maximum 7 found 
this way is called the SOS lower bound, which is often denoted by f*os- The relation /^o^ < /* 
always holds for every polynomial f{x) (it is possible that f*^g = —00). When f*^g = /*, we 
say the SOS relaxation is exact. We refer to [161 1251 126] for more details on SOS relaxations for 
polynomial optimization problems. There are two important issues for applying SOS relaxation 
in global optimization of polynomial functions: the quality and computational complexity. 
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The quality means how good is the SOS lower bound fsoa- In practice, as observed in [25], many 
nonnegative polynomials that are not "artificially" constructed are SOS. However, Blekherman 
[3 pointed out that there are much more nonnegative polynomials than SOS polynomials. But 
usually SOS relaxation provides very good approximations, although theoretically it can fail with 
high probability. When SOS relaxation is not exact, i.e., f*„g < /*, there are methods to fix 
it by applying modified SOS relaxations. Nie, Demmel and Sturmfels [23] proposed to use SOS 
representations of f{x) — j modulo the gradient ideal of f{x), and show that the minimum value 
/* can be obtained when /* is attained at some point. Schweighofer [31] proposes to minimize 
f{x) over a semialgebraic set called the gradient tentacle, and shows that the minimum value /* 
can be computed when /* > — co but not attainable. Jibetean and Laurent [13 1 . and Lasserre 
[17j propose perturbing f{x) by adding a higher degree polynomial with tiny coefficients, and 
showed that the lower bounds will converge to the minimum value /*. Recently, Laurent [19] 
gave a survey on solving polynomial optimization by using semidefinite relaxations. We refer to 
[HI [13 [H [^ [3T[ for related work. 

Another important issue for SOS relaxation is the computational complexity. Suppose f{x) 
has degree 2d (it must be even for f{x) to have a finite minimum). Then f{x) has up to ("J^"^) 
monomials. The condition that f{x) — 7 being SOS reduces to an SDP the size of whose linear 
matrix inequality (LMI) is ("^'') with ("J^'') variables. These numbers can be huge for moderate 
n and d, say, n = 2d = 10. For large scale polynomial optimization problems, the general 
SOS relaxation is very difficult to implement numerically. Sometimes this complexity makes the 
applicability of SOS relaxation very limited. We refer to [251 126] for the connection between SOS 
relaxation and SDP. 

Prior work There is some work on exploiting sparsity in polynomial optimization when the 
polynomials are sparse. In such situations, sparse SOS relaxations are available and the resulting 
SDPs have reduced sizes, and hence larger problems can be solved. Here being sparse means that 
the number of monomials with nonzero coefficients is much smaller than the maximum possible 
number ("JJ'') . Kojima et al. [15] and Parrilo 127: discussed how to exploit sparsity of SOS 
relaxations in unconstrained polynomial optimization. Kim et al. [14] and Lasserre [18] discussed 
sparse SOS relaxations for constrained polynomial optimization problems and showed convergence 
under certain conditions. Waki et al. [34] proposed a heuristic procedure to exploit sparsity for 
minimizing polynomials by chordal extension of the correlation sparsity pattern graph (csp graph): 
the vertices of the csp graph are the variables xi, ■ • ■ ,x„; the edge {xi,Xj) exists whenever XiXj 
appears in one monomial of f{x). To find one chordal extension, '34 proposed to use the symbolic 
sparse Cholesky factorization of the csp matrix with minimum degree ordering. If the chordal 
extension of the csp graph is also sparse, then the sparsity technique in [34] works well. However, 
if the chordal extension of the csp graph is much less sparse, then that sparsity technique might 
still be too expensive to be implementable for some practical problems. 

Our contributions In many practical applications, the polynomials are not only sparse, but also 
given with certain sparsity patterns. For instance, the polynomials are often summations of other 
"small" polynomials, i.e. polynomials only involving a small number of variables. Sometimes, 
these representations contain useful information that might help us save computations signifi- 
cantly. These sparsity patterns are often ignored in prior work, where these polynomials would 
be treated using the usual "dense" algorithms. The main contribution of this paper is to propose 
new sparse SOS relaxation techniques taking the given sparsity pattern into account, and to show 
numerical experiments demonstrating their accuracy and speed. 

In this paper, we consider the polynomial optimization problem of the form 

m 

^i'i fi^) = J2f^M (1-1) 

1=1 

where Ai C [n] — {1, 2, ■ ■ ■ ,n}. Here each fi{xAi) is a polynomial in x^i ~ {xj\j £ Ai). Let 
deg(/i) = 2di and 2d = deg(/) — max{2di, • • • , 2dm} (we assume each /; has even degree along 
with /). One basic and natural idea for solving problem p.l|l is to find the maximum 7 such that 

m 

/(x) - 7 = ^s^xaJ 
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where each Si{x Ai) is an SOS polynomial in XAi instead of all the variables xi, ■ ■ ■ , a;„. Exploiting 
this sparsity pattern can save significant computation without sacrificing much solution quality 
for many practical problems. In addition to presenting its numerical implementation, this paper 
will also discuss the theoretical properties of this sparse SOS relaxation and its variations. 

The main distinction of our sparse SOS technique from earlier work like Waki et al. is 
that we do not use the chordal extension of csp graphs. In case that the csp graph of f{x) in 
is chordal, our sparsity technique is almost the same as the one in [34]. However, if the csp graph 
of f{x) is not chordal and its chordal extension is much less sparse, then our sparsity technique 
is significantly more efficient. If the csp graph of f{x) is not chordal and its chordal extension 
is also sparse, then our sparsity technique is slightly more efficient while not losing much quality 
of solution. Furthermore, our sparsity technique can be applied to solve bigger dense polynomial 
optimization problems which can not be solved by other existing methods. This is due to the 
observation that every polynomial g{x) is a summation of monomials whose number of variables 
is at most the degree deg{g). So, when deg{g) is small, like 4 or 6, then the formulation p.ip is 
a good sparse model. The numerical computations show that our sparsity technique is usually 
more efficient than other existing methods in solving problems of the form (jLlf) . 

Polynomial optimization problems of the form (|l.ip have important practical applications: 
(i) Solving polynomial systems: Many large scale polynomial equations are often sparse and 
each equation might involve just a few variables, e.g., the polynomial equations obtained from 
discretization in nonlinear differential equations. Such polynomial systems can be equivalently 
transformed to a global polynomial optimization problems of the form We will show that 

the proposed sparse SOS relaxation is exact when the polynomial system has at least one real 
solution, (ii) Nonlinear least squares: Many difficult problems in statistics, biology, engineering 
or other applications require solving certain nonlinear least squares problems and finding their 
global optimal solutions. If each equation is sparse, then sparse polynomial optimization is 
a very natural model and our sparsity technique is very suitable. Sensor network localization is 
one important application of this kind. 

Outline This paper is organized as follows. Section 2 introduces some notation and background 
for SOS relaxations. Section 3 discusses properties of the sparse SOS relaxation and its variations. 
Section 4 presents some numerical implementations, and Section 5 shows applications. Lastly, 
Section 6 draws some conclusions and discusses future work in this area. 

2 Preliminaries 

This section introduces some notations and backgrounds in SOS relaxation methods for minimizing 
polynomial functions. 

Throughout this paper, we will use the following notation: R is the field of real numbers; N 
is the set of nonnegative integers; R"^' = {{xk^, ■ ■ ■ ,Xkf) : Xk- € R} when A; = {fci,A;2, ■ • ■ ,ki}; 
M.[X]: the ring of real polynomials in X = {xi,X2, ■ ■ ■ , Xn); R[XaJ: the ring of real polynomials in 
Xa, = {xk)ksAi; EKM^: SOS polynomials in R[X]; ER[XaJ^: SOS polynomials in R[XaJ; 
X;Riv[X]^: SOS polynomials in R[X] with degree at most 2N; ERjv[^aJ^: SOS polynomials in 

R[XAi] with degree at most 2iV; ||2:||2 = ^x^+x^ H \- xl; x°' = x^^x"^^ • • • a;^" for a G N"; 

supp(q:) = {j £ [n] : ai 7^ 0}; supp(/) = {a £ N" : the coefficient of x" in f{x) is nonzero}; \F\ 
denotes the cardinality of set F\ denotes the transpose of matrix A; Ay (^)O means matrix A 
is positive semidefinite (definite); Aid{y) is the moment matrix of order d about x £ R"; A4'^'{y) 
is the moment matrix of order d about xa^ £ R"^'; Mpiy) is the moment matrix generated 
monomials with support F. 



2.1. SOS and semidefinite programming (SDP) 

A polynomial p{x) in a; = {xi, ■ ■ ■ , a;„) is said to be sum of squares (SOS) if p{x) = Ei pli^) for 
some polynomials Pi{x). Obviously, if p{x) is SOS, then p{x) is nonnegative, i.e., p{x) > for all 
x £ R". However, the converse is not true. lfp{x) is nonnegative, thenp(2;) is not necessarily SOS. 
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In other words, the set of SOS polynomials (which forms a cone) is properly contained in the set of 
nonnegative polynomials (which forms a larger cone). The process of approximating nonnegative 
polynomials by SOS polynomials is called SOS relaxation. For instance, the polynomial 

4 4 4 4 
Xi + X2 + X3 + — '1X1X2X3X4, 

f r / 2 2 2 I 2\2 , / 2 I 2 2 2\2 . / 2 2 2 , 2\2 
= g 1(3^1 - a;2 - 2:4 + 2:3) + (Si + ^2 - ^4 - 3:3) + (Xi - X2 - 2:3 + X4) + 

2{x\Xa — X2Xz)^ + 2{x\X2 — 2:3x4)^ + 2(xia;3 — a;22;4)^} 
is SOS. This identity immediately implies that 

xt + X2 + xfi + X4 ~ 4x1X2x3X4 > 0, V(xi, X2, X3, X4) G R'*, 

which is one arithmetic-geometric mean inequality. 

The advantage of SOS polynomials over nonnegative polynomials is that it is more tractable 
to check whether a polynomial is SOS. To test whether a polynomial is SOS is equivalent to 
testing the feasibility of some SDP [251 126) . which has efficient numerical solvers. To illustrate 
this, suppose polynomial p(x) has degree 2d (SOS polynomials must have even degree). Then 
p{x) is SOS if and only if [251 126| there exists a symmetric matrix W y such that 

p(x) = md(x)^ W md(x) 

where md{x) is the column vector of monomials up to degree d. For instance, 

m2(xi,X2) = [1, Xi, X2, xf, XiX2, X2]^. 

As is well-known, the number of monomials in x up to degree dis {"Y). Thus the size of matrix 
W is ("J''). This number can be very large. For instance, when n — d = 10, ("J'') > 10^. 
However, for fixed d (e.g.,d — 2), ("^'^) is polynomial in n. On the other hand, it is NP-hard 
(with respect to n) to tell whether a polynomial is nonnegative whenever 2ci > 4 (even when d is 
fixed) llSj. 



2.2. SOS relaxation in polynomial optimization 

Let /(x) = "^^a faX°' be a polynomial in x. Consider the global optimization problem 

f* ~ min fix). 

This problem is NP-hard when deg(/) > 4. The standard SOS relaxation is 

faoa := max 7 

s.t. / (x) — 7 is SOS . 

Obviously we have that ftos < /*■ In practice, SOS provides very good approximations, and 
often gives exact global minimum, i.e., /'^^ = /*, even though theoretically there are many more 
nonnegative polynomials than SOS polynomials 5 . 

In terms of SDP, the SOS relaxation can also be written as 

fsos ■- max 7 (2.1) 
s.t. f{x)~'y = ma{xfWma{x) (2.2) 
^ (2.3) 

where 2d — deg(/). The decision variable in the above is (7, W) instead of x. The above program 
is convex about (7, W). A lower bound f^os can be computed by solving the resulting SDP. It 
can be shown [TB] that the dual of PTT|) - (P3)) is 

/mom ■- min f^V" (2-4) 

y — 

\a\<2d 

S.t. Mdiy) t (2.5) 
yo,...,o = l. (2.6) 
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Here Md{y) is the moment mairia; generated hy y — (ya), a vector indexed by monomials of degree 
at most 2d. The rows and columns of moment matrix Md{y) are indexed by integer vectors. Each 
entry of Mci{y) is defined as 

Mi{y){a,l3) ~y^+p, V|a|,|/3| <d. 
For instance, when d = 2 and n = 2, the vector 

y = [yo,o, yi,o, yo,i, 2/2,0, 2/0,2, 2/3,0, y2,i, 2/1,2, 2/0,3, y4,o, 2/3,1, 2/2,2, t/1,3, 2/0,4], 
defines moment matrix 



A^2(2/) = 



2/0,0 


2/1,0 


2/0,1 


2/2,0 


yi,i 


2/0,2 


2/1,0 


2/2,0 


2/1,1 


2/3,0 


2/2,1 


2/1,2 


2/0,1 


2/1,1 


2/0,2 


2/2,1 


yi,2 


2/0,3 


2/2,0 


2/3,0 


2/2,1 


2/4,0 


2/3,1 


2/2,2 


2/1,1 


2/2,1 


yi,2 


2/3,1 


y2,2 


2/1,3 


.2/0,2 


2/1,2 


yo,3 


2/2,2 


2/1,3 


2/0,4. 



For SOS relaxation (f2TT|) - (f2^ and its dual problem (l2^ - ([23)) . strong duality holds [16], i.e., 
their optimal values are equal {fsos ~ fmom)- Hence f^om is also a lower bound for the global 
minimum /* of /(a;). 

Now let us see how to extract minimizer(s) from optimal solutions to (|2.4|l - H2.6p . Let y* be 
one optimal solution. If moment matrix A4d{y*) has rank one, then there exists one vector w such 
that Mdiy") = witF . Normalize w so that W(o,... ,o) ~ 1- Set x* — w{2 : n + 1). Then the relation 
Md{y*) ~ wuF immediately implies that y* — m2d(x*), i.e., i/a = (2:*)", so fmom = fi^*)- This 
says that a lower bound of f{x) is attained at one point x* . So a;* is one global minimizer. 

When moment matrix Md{y*) has rank more than one, the process described above does not 
work. However, if Aid{y*) satisfies the so-called fiat extenston condition 

Ta.nkMkiy*) = rank A^fc+i(2/*) 

for some 0<fc<m — 1, we can extract more than one minimizer (in this case the global solution 
is not unique). When the flat extension condition is met, it can be shown 8 that there exist 
distinct vectors ui, - ■ ■ ,Ur such that 

Md{y*) = Aimd(?ii) ■ md(ui)^ H + A,.md(ur) • md(ur)^ 

for some A; > 0, "^Zl^i ~ 1- Here r = raxikMdiy*)- The set {mi, • ■ • , Ur} is called an r-atomic 
representing support for moment matrix Md{y*)- AH the vectors Mi, • • • ,m,. can be shown to be 
global minimizers. They can be computed by solving some particular eigenvalue problem. We 
refer to |8| for flat extension conditions in moment problems and [12] for extracting minimizers. 



2.3. Exploiting sparsity in SOS relaxation 

As mentioned in the previous subsections, the size of matrix W in SOS relaxation is ("J''), 
which can be very large. So SOS relaxation is expensive when either n or d is large. This is true 
for general dense polynomials. However, if f{x) is sparse, i.e., its support = supp(/) is small, 
the size of the resulting SDP can be reduced significantly. Without loss of generality, assume 
(0, • • • ,0) £ J-. Then supp(/) — supp(/ — 7) for any number 7. 

Suppose f{x) ~ J — 4>i(^Y i'^ an SOS decomposition. Then by Theorem 1 in [29] we have 

supp(c/>i) C := ^ the convex hull of ^-^"^^ 

where = {a £ T : a is an even integer vector}. There exist some work [151 134] on exploiting 
sparsity further. Here we briefiy describe the technique introduced in [M] . 
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For polynomial f{x), define its csp graph G — {[n], E) such that (i, j) £ -E if and only if XiXj 
appears in some monomial of f{x). Let {Ci, C2, ■ ■ ■ , Cx} be the set of all maximal cliques of 
graph G. Waki et al. [31] proposed to represent f{x) — 7 as 

K 

f{x) — 7 = Si{x), each Si{x) being SOS supp(si) C Ci. 

Theoretically, when f{x) — 7 is SOS, the above representation may not hold (see Example 13. 5|) . 
And it is also difficult to find all the maximal cliques of graph G. Waki et al. [34] propose to 
replace {Ci, C2, ■ ■ • , Ck} by the set of all maximal cliques of one chordal extension of G. We refer 
to 3] for properties of chordal graphs. For chordal graphs, there are efficient methods to find all 
the maximal cliques. Chordal extension is essentially the sparse symbolic Cholesky factorization. 
See [34] for more details on how to get the chordal extension. 

We remark that in the worst case the sparse SOS relaxation above might be weaker than the 
general dense SOS relaxation even when the chordal extension is applied, as shown by Example l3.5l 

There is much work in exploiting sparsity in SOS relaxations. We refer to [9] 1141 1151 1271 134] 
and the references therein. 



3 The sparse SOS relaxation 

Throughout this paper, we assume f{x) — fi{xAi). Let |[A|| be the maximum cardinality 

of Ai, i.e., ||A|| — maxi |Ai|. We are interested in the case that ||A|| <^ n. To find the global 
minimum /* of f{x), we propose the following sparse SOS relaxation 

/a ~ max 7 

m 

s.t. ./(a;)-7Gl]^K<i[a;Aj'. 

1=1 

In terms of SDP, the above SOS relaxation is essentially the same as 

/a := max 7 (3.1) 

s.t. /(x) ~7 = ^md(a;Aj^mmd(xAj (3.2) 

i=l 

m >r 0, i = 1,- ■■ ,m. (3.3) 

Notice that (|3.2p is an identity. Let 

J^i = {a G N" : supp(q) C Ai, \a\ < 2d}, T = [jT,. (3.4) 

Write f{x) = X^a/a^"- Since f{x) = fi{xAi), /a 7^ implies that a £ T. By comparing 
coefficients of both sides of (|3.2|l , we have equality constraints 



i = l 1=1 17 + T = a 

Now we derive the dual problem for (|3.1|l - (|3.3p . Notice that constraint (|3.2p is equivalent to 
the equality constraints (|3.5p . Let y = {ya)asT be the Lagrange multipliers for equations in (|3.5[l . 
and Ui be the Lagrange multipliers for inequalities in (|3.3p . Each Ui is also positive semidefinite. 
The Lagrange function for problem (|3.1|l - (|3.3p is 

^ = 7+ (./■o-7-E^»(0'0)) 2^0+ E (/--E E W/i(»,,r) U<, + ^m.f/i 

m 

= 7(1 - J/o) + ^ /.yc + ^ ^ m(r?,r)([/i(r,,r)-y<. 
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So we can see that 



max £(7, Wi,yQ,C/,) 

-i-oo otherwise. 



Therefore the dual of (I3TT])-(I331) is 

/s := min ^ faVa (3.6) 

S.t. X^'(y) ^ 0, i = 1, ■ ■ ■ ,m (3.7) 

J/o = 1. (3.8) 



3.1. Complexity comparison 

Since the dual of the standard or sparse SOS relaxation not only returns the SOS lower bound 
but also provides the moment matrix to help extract minimizers, we compare the computational 
complexity of jm-dm and The LMI 1131) is of size = 0(n'*) and has = 

0{n^'') decision variables. At each step of an interior-point method (e.g., the dual scaling method 
[3]), the complexity for solving (^^ - (^3)) is 0{n^''). On the other hand, (fXT)! has m LMIs, which 

are of sizes at most ("'^|+'*) = 0(||A||'*), and O (m(ll'^|+^'*)) = 0{m\\Af'') decision variables. 

At each step of interior-point methods, the complexity for solving (|3.6p - (|3.8p is C(m^ || A||®'*). 
When II A II is independent of n and m — 0{n^) with p < 2d, then 

©(m^'llAf'') < 0{n''). 

Therefore (|33)) - (p!8)l is much easier to solve than (IO)) - (l2J)l . 

The complexity of sparse SOS relaxation in (3^ depends on the chordal extension of the csp 
graph. In the worst case, it can be as big as for the general SOS relaxation (|2.4p - (|2.6p . Let Q be 
the maximum size of the maximal cliques of the chordal extension. In practice, Q is often bigger 
than or equal to |1A||. When > ||A||, the SOS relaxation (|3.6p - (|3.8p is usually more efficient. 



3.2. Lower bound analysis 

RecaU that J^i = {a e N" : supp(Q) C Ai, |a| < 2d}. From the representation p.ip of f{x), 
we have 

supp(/) C U T^. 

i=l 

This leads us to think that the relaxation (|3.6p - (|3.8p should give reasonable lower bounds, although 

it might be weaker than the general SOS (see Example 13. 5p . 

Theorem 3.1. The optimal values f^,, Ia, fsos, f* satisfy the relationship 

fi ~ fl < fsos < / ■ 



Proof. The latter two inequalities are obvious because the feasible region defined by (|3.7p - (|3.8p 
contains the one defined by (|2.5p - (|2.6p . To prove the first equality, by the standard duality 
argument for convex program, it suffices to show that p.7p admits a strict interior point. Define 
y = {ya)a&T as ^ 

^ /k" ^ ^ dx 

For every nonzero vector ^ = {^a)asTi, we have 

So Afj^' (y) >- for every 1 < i < m. Therefore y is an interior point for (I3.6p - (l3.8p . which implies 
the strong duality fs = fA- C 
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Remark 3.2. Theorem [STT] implies that the lower bound /a given by (|3.1|l - (|3.3|) is weaker than 
the SOS lower bound fsos- There are examples such that < f*gg (see Example 13. 5p . However, 
in many numerical simulations, the lower bound is very useful. For randomly generated 
polynomials, as shown in Section 4, it frequently happens that = fsos- On the other hand, 
under some conditions, we can prove = f^^g. 

Suppose Ai, A2, ■ • • , Am satisfy the running intersection property: 

For every l<j<m-l, 3k <i such that Ai+i n ^[J Aj^ C At- (3.9) 

Theorem 3.3. Suppose (3.6p - (3.8p has a optimal solution y* such that each M^^in*) has a 
representing measure fii on R^* . // condition (3. ffl) holds, then = f^oa ■ 

Proof. For any Ai, Aj, M^^'^^^ (y*) is a common principle submatrix of A^^' (j/*) and A4^^ iu*)- 
So the marginals of measures /li are consistent, i.e., the restrictions of these measures on the 
common subspaces are the same. By Lemma 6.4 in [18], there exists a measure on R" such that 
Hi is the marginal of fi with respect to Ai for all i = 1, • ■ • , m. Define vector y such that 

Md{y) = / md{x)ind{x)'^ iJ.{dx). 

Then every Ai'^'iy*) is a principle submatrix of A4d{y)- So = Ha whenever supp(a) C Aj for 
some j. Since the /a 7^ implies supp(a) C Aj for some j, we know the objective value of (|3.6p 
is the same for y* and y. Thus f*^^ < f^. Since f*^^ > fl, we get f*^^ ^ .fl- □ 

Remark 3.4. The running intersection property (|3.9|l alone is not sufficient to guarantee the 
equality = f^os , as shown by the following example. 

Example 3.5. f{x) = fi{xi,X2)+f2{x2, xg) where /i — xf + {xiX2 — l)^ and /2 = xlxl + {xl~l)^ . 
Solving dense SOS relaxation (I2.H) - (|2.3|) and sparse SOS relaxation (|3.1|l - (|3.3|) numerically, we 
find that 

/i « 5.0 • 10"" < ^ 0.8499. 

Actually the minimum /* ~ 0.8650. First, solve equation \/f{x) — 0, and evaluate f{x) on 
these critical points, then we find the minimum of these critical values is about 0.8650. So 
/* < 1. Second, we prove that the minimum /* is attainable. Let {a;''''} be a sequence such that 
/(x'-'^') f* as k goes to infinity. We claim that the sequence {x^''^} must be bounded. 
Otherwise, suppose x^''^ — > 00. Thus at least one of coordinates a;^*', x^\ x'^J'^ 
should go to infinity. If either a;^*' or xij'^ goes to infinity, then f{x'^'^^) goes to 
infinity, which is not possible. So x''^^ 00. Since {f{x^''^)} is bounded, without 

1 £ 1-i {*) (fc) {*=) (fe) (fe) (fc) 

loss of generality, we assume x\ — > ai, x\ 'Xj ~^ oii2: X2 x^ 023, a^g — > 03 
for some numbers 01,012,023,03. If 03 = 1, then x^^ is convergent to 02, which 
is not possible. And, if 03 7^ 0, then x'^'^^x^^-' goes to infinity, which is also not 
possible. So 03 = 0, and hence 

/(a;"=')>(a;^-l)2^1>r, 

which is a contradiction. 
So the sequence {a;'*''} is bounded and has an accumulation point x* . Then we must have 
f{x*) — f , which means that /* is attained at some point. From the computation of critical 
values, we know /* ~ 0.8650. For this polynomial, both the dense and sparse SOS relaxation are 
not exact: < /'^^ < /*, and the method in [34] gives the same lower bound 

Corollary 3.6. // all fi are quadratic and condition 113. yp holds, then f*^^ = 
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Proof. When all fi are quadratic, i.e., di — 1, the entries of moment matrix Alf * are the first and 
second order moments. The positive semidefiniteness of Mf^ implies Alf ' has a representing 
measure. Then the conclusion is immediately implied by Theorem 13.31 □ 

Remark 3.7. If the running intersection condition (|3.9|) fails, then Corollary 13.61 is no longer 
true, as shown by the example below. 

Example 3.8. Consider the polynomial f{x) = fi{x\,X2) + f2{x2,x^) + /3(a;i,a;3) where /i = 
\{x\ + X2) + 2x^X2, f2 = 5(2:2 + xl) + 2x2x3, fs = ^{xl+ xl) + 2x1X3. In this case 

Ai = {l,2}, A2 = {2,3}, A3 = {1,3}. 

The running intersection property (|3.9p fails. But we have — —00 < fg^s = /* = 0. 



3.3. Extraction of minimizers 

In this subsection, we discuss how to extract minimizer(s) x* — {xl, ■ ■ ■ , x*). Suppose y* = 
{y%)a£T is one optimal solution to (|3.6|l - (|3.8p . Let 5i — {i} for every i. The entries of y* whose 
indices are supported in 5i are 

2/0 , Vei , , ■ ■■ , J/2de, 

which are the entries the moment matrix M^'{y*). So coordinate x* can be extracted from 
moment matrix M^'{y*) if it satisfies the fiat extension condition. Let Vi be the set of all the 
points that can be extracted from the moment matrix AI^'{y*). If Vi is a singleton, then x* has 
a unique choice. 

The situation is more subtle if some Vi has cardinality greater than one. Suppose for some 
i,j € [n] we have |Vj| > 1 and \Vj \ > 1. Can x* x* appear simultaneously in the optimal solution 
X* for arbitrarily chosen x* £ Vi, x* £ Vj? The answer is obviously no! For instance, the 
polynomial 

{xl - 1)^ + {xl - 1)^ + {Xi - X2)^ 

has only two global minimizers ±(1,1). We find that Vi = V2 ~ {1,-1}. But obviously (1,-1) 
and (—1, 1) are not global minimizers. 

Now what is the rule for matching x* and x* if |Vi| > 1 or \Vj\ > 1? So far we have not 
yet used the information of moment matrix M^^{y*). If M^'{y*) also satisfies the flat extension 
condition, we can extract the tuples a;^^. — {xDhs/^^ from M^^{y*). Let be set of all such 
tuples that can be extracted from M^^{y*). One might ask whether Vi and A'a; are consistent, 
that is, does x*^. £ X^t imply that xX G Vk for all k G Ai? Under the flat extension assumption, 
the answer is yes, which is due to the following theorem. 

Theorem 3.9. Suppose y* is one optimal solution to ll3.6p - (3.8\) such that all M^^{y*) satisfy 
the flat extension condition. Then for any x^ . £ X/\. , it holds that xl £ Vk for all k £ Ai. 

Proof. Let Xa^ = {a;^', 2;^', • • • , a;^'} be the r-atomic representing support for M^^{y*). Then 
we have decomposition 

r 

M^ivl = E ^irn2{x2)m2{x2f 

1=1 

for some Ai, • • • , Ar > 0, X^fci = 1- Notice that M^'' {y*) is a principle submatrix of M^^ {v*)- 
So we also have that for every k £ Ai 

r 

A^^(y*) = E ^^™2(4'')m2(4''f ■ 

1=1 

This mecLiis that {3;^ \ \ ■ • ■ , x )['} is a r-atomic representing support for moment matrix 
M^''{y*) (some x^^' might be the same). By the definition of Vi, we have {x^k \ ■ ■ ■ ,x^i^^} C Vk- 
□ 
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Theorem 3.10. Suppose y* is one optimal solution to Ii3.6p - (3.8fl such that all M^'{y*) satisfy 
the flat extension condition. Then any x* = [xX, ■ ■ ■ , x* ) with x^ £ Vk and x^. £ Xa^ for all k 
and i is a global optimal minimizer of fix). 

Proof. Fix X* as in the theorem. Since Mf^{y*) satisfies the fiat extension condition, we have 
the decomposition 

M^'iy*) = ^A.md(a;Ajm2(a;lJ^ + Ma. 
where 1 > Aa^ > and Ma^ cl 0. Now let A — min^ Aa; > and 

Ma. = (AAi - A)m2(a;lJm2(a;lJ^ + Ma; h 0. 

Notice that Ma^ and Ma^ are also moment matrices. Without loss of generality, we can assume 
A < 1, since otherwise each M^^{y*) has rank one and then x* is obviously a global minimizer. 
For every a G Ti, define ija = {x'^.)°' and y = {ya)aey^. Let y — {ya)aeT be the vector such that 
J/* = Ay + (1 - X)y. Then it holds 

M^'iy*) = XMf'iy) + (1 - X)Mf'{y). 

Obviously vector y is feasible for (|3.7|l - (|3.8|l since 

Mt^iy) = (Mt'iy*) - XMfHy)) = j^Ma. h 0. 

Since y* is optimal, we can see Y.^^jr .Uy^ < Y^^er f'^V'^ ^^'^ Y^aer f'^V^ ^ Y^aerf'^V'^- % 
linearity, it holds 

/a = ^ fcVa = ^ X] .Uva + (1 - 5Z f'^y"- 

Therefore, we must have X^^g^r/a^a = /a since < A < 1. On the other hand, by the definition 
of y, we know f{x*) = X^^g^ /aj/a ~ /a- Thus x* is one point at which the polynomial f{x) 
attains its lower bound /a, which implies that x* is a global minimizer of fix"). □ 

The algorithm for minimizing f{x) via sparse SOS relaxation (|3.ip - p.3p is as follows. 
Algorithm 3.11 (Minimizing sum of polynomials) . 
Input: n,m, Ai, fi{xAi) (i = 1, ■ ■ ■ y'm) 
Output: Vi and (i = 1, • • • , m) 
Begin 

Step 1: Solve the dual problem (|3.6|l - (|3.8|l . Get the optimal solution y* . 

Step 2: For each 1 < fc < n, find the set Vk of points that can be extracted from M^'' (y*). 

Step 3: For every k with |Vfe| > 1, find the set Aa^ from Mf^{y*) whenever fc G Ai. 

End 



As an example, let us illustrate how to solve the global optimization problem 



mm 



[xi - Vf + (Xx - X2)* + {X2 - X3) 



and find global minimizers. Here Ai — {1, 2} and A2 = {2, 3}. Solve the dual problem (|3.6|l -(f3 
and we get solutions 





1 
1 









1 1 1 

10 

10 

1 1 1 

1 1 1 

1 1 1 



10 



Both Ai^^{y*) and M^^{y*) have rank two and satisfy the flat extension condition. Using the 
technique from [T^, we can extract 



Vi = V2 = Vs = {-1,1} 

and 





-1 




1 


1 , = 1 


-1 




1 




{ 


-1 




1 


-1 




1 


} 



Since the a;2-component from Xai and Xa2 must be the same, we know there are two global 
miriimizers x* = ±(1, 1, 1). 



3.4. Nonlinear least squares problems 

Now we consider the special case that each fi{xAi) is a square of some polynomial, say, 
fi(x Ai) = glixAi)- Then the global minimization of f{x) — J]]^ fi{x Ai) is equivalent to solving 
the nonlinear least squares (NLS) problem associated with the polynomial system: 

gi{xAi) = g2(xA2) = ■■■ = gm{xA,J = 0. (3.10) 

In this situation, the polynomial function is often nonconvex and it is very difficult for general 
numerical optimization schemes like branch-bound to find the global minimizer of f{x). 

Theorem 3.12. If the polynomial system l3.1(Kl admits a solution, then the sparse SOS relaxation 
IS exact, i.e., = f*„^ = /*. 

Proof. Obviously /* = 0. And 7 = is a feasible solution to problem (|3.1|l - (|3.3p . since f{x) itself 
is a sparse SOS representation as in (|3.2p - (|3.3p . So /a > 0, and hence all the inequalities in the 
Theorem 13.11 become equalities. □ 

Remark 3.13. When the polynomial system (|3.10|l admits a solution, we necessarily have /* — 0. 
This might be trivial in some sense. However, the optimal solution y* to the dual problem (|3.6|l - 
(|3.8p can help recover the real zeros of polynomial system (|3.10|l . which are absolutely the global 
minimizers of f{x). See the example below. 

Example 3.14. Consider the sparse polynomial system 

2x1 - 3a;i + 2a;2 - 1 = 
2x^ + Xi^i - 3xi + 2xi+i - 1 = {i = 2, ■ ■ ■ , n - 1) 
2xj^ Xji — 1 '^Xji 1 — 0. 

This polynomial system is consistent and has at least two real solutions. Set n = 20. We apply 
sparse SOS relaxation (l3.H) - p.3|l to solve the least squares problem and get the lower bound 
fA ~ —2.0 ■ lO"'^^. Using the optimal dual solution, we obtain two real solutions (only the first 
four digits are shown) 

x = (1.8327, -0.1097, -0.5929, -0.6860, -0.7032, -0.7064, -0.7070, -0.7071, -0.7071, -0.7071, 

- 0.7071, -0.7070, -0.7068, -0.7064, -0.7051, -0.7015, -0.6919, -0.6658, -0.5960, -0.4164) 
£ = (-0.5708, -0.6819, -0.7025, -0.7063, -0.7070, -0.7071, -0.7071, -0.7071, -0.7071, -0.7071, 

- 0.7071, -0.7070, -0.7068, -0.7064, -0.7051, -0.7015, -0.6919, -0.6658, -0.5960, -0.4164). 



3.5. A sparser SOS relaxation 

From Theorem 13.121 we know the sparse SOS relaxation (|3.1|l - (|3.3|l is exact whenever the 
polynomial system (|3.10p admits a solution, and the optimal dual solution can help recover the 
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real zeros. This fact makes it possible to exploit the sparsity of each fi{x Ai) further. In (|3.1|l - 
(|3.3p . we assume each fi{xAi) is a dense polynomial. However, if each fi{xAi) is sparse, we can 
get a sparser SOS relaxation. It is obvious that 

supp(/i) C C/j + C/j 

where Qi is the convex hull of {a € N" : 2a £ supp(/i)}. This motivates us to propose the sparser 
SOS relaxation 

/a, ~ max 7 (3-11) 

S.t. f{x)-'y = J2'^sAxfW^mg^{x) (3.12) 

1=1 

m ^ 0, i = I,--- ,m. (3.13) 

Here mg-{xAi) is the column vector of all monomials in x with exponents from Qi. The size of 
matrix Wi is equal to the cardinality of Gi. Similar to (|3.ip - (|3.3p . the dual of (|3.1ip - (|3.13p can be 
derived to be 

/s^— min ^/ct/Q (3.14) 

a 

s.t. Mg,iy)tO,i = i,---,m (3.15) 
yo = 1. (3.16) 

Here the sparse moment matrix Mg^ {y) is indexed by vectors from Qi and defined as 

Mgi{y){a,f3) = y^+fj 

for all a,P £ Qi. 

Theorem 3.15. The optimal values /^^, /a^, /ei /aj .fsos, .f satisfy the relationship 

/Ss ~ JAs < /s = /a < fsos < /* ■ 

Proof. Applying the standard duality theory in convex programming as in the proof of Theo- 
rem [3Tl] we can get the first equality from the left by proving p.l2p - p.l3p has a strict interior 
point. Since the relaxation (|3.1ip - (|3.13p is a special case of (|3.ip - (|3.3p . we obtain the first inequal- 
ity from the left. The other relations follow Theorem 13.11 □ 

Theorem 3.16. Suppose fi{xAi) = QiixAi). If the polynomial system {223^ admits a solution, 
then the sparse SOS relaxation fy.ll[l - ll3.13p is exact, i.e., /^^ = fsos = ,f*- 

Proof. The proof is almost the same as for Theorem 13.121 Obviously /* = 0. And 7 = is a 
feasible solution, since f{x) itself is a sparse SOS representation as in (|3.12p - (l3.13p . So fA^ > 0, 
and hence all the inequalities in the Theorem [STTS] become equalities. □ 

Remark 3.17. When the polynomial system (|3.10p admits a solution, we must have /* = 0. 
This lower bound itself might not be interesting. However, the optimal dual solution y* to (|3.14p - 
(|3.16p can help recover the real zeros of polynomial system (|3.10p . which are absolutely the global 
minimizers of f{x). This observation is very important and has many applications. See examples 
in Subsection 5.1. 



4 Numerical examples 

In this section, we present some numerical experiments using sparse SOS relaxations (|3.ip - (|3.3p 
and (|3.1ip - (|3.13p . First, we use them to solve some test problems from unconstrained optimiza- 
tion. Second, we generate various random polynomials, test the performance of these sparse SOS 
relaxations and compare with other methods. All the computations are implemented on a Linux 
machine with 0.98 GB memory and 1.46 GHz CPU. The SOS relaxations are solved by softwares 
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SeDuMi [32) and YALMIP [20]. Throughout this section, the computation time is in CPU sec- 
onds. The accuracy of relaxations is measured by n^J/^i^^^^^^)!) . where x is one extracted solution 
and / is the computed lower bound. 

4.1. Some global optimization test problems 

In this subsection, we apply SOS relaxations (|3.1[) - (|3.3|l and ([3.1ip - p.l3[) to solve some global 
optimization test problems from [71 1211 [2^ . The relaxation (|3.ip - (|3.3[) is usually applied when 
each /i(Ai) is almost dense, and the sparser relaxation (|3.11[) - (|3.13[) is usually applied when each 
fi{Ai) is sparse. All the test functions in this subsection have global minimum /* — 0. So we use 
the absolute value of the lower bounds or /^^ to measure the accuracy of the relaxation. 

First, consider the following test functions. 

• The chained singular function [7]: 

f{x) = X] + Wxi+i)^ + 5{xi+2 - Xi+s)'^ + {xi+1 - 2xi+2)'^ + W{xi - 10a;i+3)'') 

where J = {1, 3, 5, ■ • • , n — 3} and n is a multiple of 4. The factor is used to scale the 
coefficients to avoid numerical troubles. 

• The chained wood function [7] 

fix) = X](100(^^+i - ^?)' + (1 - + 90(2;,+3 - s?+2)' + 

(1 - Xi+2)'^ + 10(a;i+i + Xi+a - 2)^ + 0.1(a;i+i - Xi+3)^) 

where J = {1, 3, 5, • • • , n — 3} and n is a multiple of 4. 

• The generalized Rosenbrock function [22| : 

n 

We apply SOS relaxation (|3.1[) - (|3.3|l to minimize these polynomial functions. The relaxation (|3.1|l - 
(|3.3p is solved by softwares SeDuMi and YALMIP. The accuracy and consumed CPU time are in 
Table[T] The problems are solved from size 100 to 500. For these polynomials, the relaxation (|3.1|l - 





chained singular 


chained wood 


gen. Rosen. 


n 


accu. 


time 


accu. 


time 


accu. 


time 


100 


3.2e-09 


2.72 


3.5e-10 


1.52 


9.0e-8 


0.95 


200 


3.Ge-10 


5.29 


3.7e-10 


2.25 


1.8e-7 


1.46 


300 


5.Ge-09 


8.01 


3.8C-10 


3.19 


2.7e-7 


2.24 


400 


5.0e-10 


11.64 


3.9e-10 


4.12 


3.6e-7 


2.88 


500 


4.9e-09 


33.09 


3.9e-10 


5.12 


4.5e-7 


3.45 



Table 1: The performance of sparse SOS relaxation ()3.ip - p.3p 



(|3.3p is almost the same as the one in [34]. This is because the csp graphs of these polynomials 
are chordal graphs. However, if the csp graphs are sparse but their chordal extensions are much 
dense, then the relaxation in [34] is very similar to the dense SOS relaxation. In such situations, 
the relaxation (I3.ip - (|3.3|) might be more suitable. For example, to minimize the sparse polynomial 

{xi + xl- if + {xl + xl-if -\ h (xl^i + xi- if + {xi + xi- if, 

the chordal extension of the csp graph is the complete graph, and hence the sparse SOS relaxation 
using chordal extension is the same as the dense SOS relaxation. However, the sparse relaxation 
(|3.ip - p.3|) is very suitable for this problem. 

Second, consider the following test functions. 
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• Broyden tridiagonal function [21) : 

fix) = ^ ((3 - 2x,)x, - Xi-i - 2xi+i + l) 

i = l 

where xq = Xn+i = 0. 

• Broyden banded function [21) : 

fix) = J2 ^'(2 + 10a;?) + 1 - ^ (1 + xj)x, 

where J; — {j : j ^ i, max(l, i — 5) < j < min(n, i + 1)}. 

• Discrete boundary value function [7]: 

^ 1 2 

fix) = J2 {^^^ ~ ^'-1 - ^'+1 + 2 + + ^)^) 

i — 1 

where /i — U = i/i, xq = a;„+i = 0. 

These three polynomials have sparse summand polynomial /^.. So we apply the sparser SOS 
relaxation p.lH) - (|3.16p and solve it by using softwares SeDuMt and YALMIP. The computational 
results are in Table [2l All the problems are solved quite well in a few seconds. 



Broyden Tridiagonal 


Broyden banded 


disc, bound val. 


n 


accu. 


time 


n 


accu. 


time 


n 


accu. 


time 


100 


1.2e-7 


2.65 


10 


3.6e-ll 


9.72 


10 


6.0e-12 


0.92 


200 


2.3e-7 


2.69 


15 


2.2e-10 


17.28 


20 


3.4e-ll 


1.57 


300 


5.0e-7 


3.58 


20 


1.6e-10 


25.27 


25 


1.6e-ll 


2.28 


400 


3.0e-6 


4.53 


25 


1.8e-10 


35.19 


30 


l.le-11 


2.47 


500 


4.1e-6 


5.44 


30 


4.9C-10 


45.30 


35 


3.9e-ll 


3.00 



Table 2: The performance of sparse SOS relaxation ()3.1ip - (|3.13p 



For the Broyden tridiagonal function, we can also apply the sparse relaxation (I3.1|I - (I3.3|I or 
chordal extension from [34] ■ They are slightly more expensive. For n = 500, the problem can be 
solved in about ten seconds with similar accuracy. However, for the Broyden banded function and 
discrete boundary value function, the relaxation (|3.ip - (|3.3p and the method in [34] are much more 
expensive. For instance, when n has values 10 or bigger, they are usually difficult to implement 
due to computer memory restrictions. 

One interesting observation in Table[2]is that the accuracy for the Broyden tridiagonal function 
is not as high as for the other two functions. One possible reason is that the global minimizer 
of Broyden tridiagonal function is not unique and there are additional numerical troubles caused 
from extracting minimizers. This illustrates that the computation is more numerically difficult 
when there are multiple global solutions. 



4.2. Randomly generated test problems 

In this subsection, we present the computational results for randomly generated polynomials. 
The aim is to test the performance of the sparse SOS relaxation (I3.ip - (|3.3p for minimizing random 
polynomials and compare with other sparse SOS methods. For these randomly generated polyno- 
mials, solve the sparse relaxation (|3.1|l - (|3.3|l by using softwares SeDuMi and YALMIP. Then we 
get lower bounds and extract minimizers x. Since we do not know the true global minimizers 
in advance, the accuracy of x can be measured by err = max{°i |/{^)|} ' smaller err is, the 
more accurate x is, since is a guaranteed lower bound. 
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4.2.1 Randomly generated sums of small polynomials 



In this subsubsection, we randomly generate sparse polynomials f{x) of the form Hl.lf) and use 
them to test the performance of the sparse relaxation (|3.1|l - (|3.3|) . Then the csp graph of f{x) is 
usually not chordal and its chordal extension is often much less sparse. So the method in [34] is 
usually expensive for these polynomials. We let m = n and choose fi to have form 

fiixA,) = md{xA,)'^ ■ Ai ■ md(xA,) + b'[m2d~i(xA,) 

where Ai are chosen to be random subsets of [n] with cardinality at most ||A||. Here Ni = (''^'j''''') i 

Ai = n/jv. + BB"^, B e R^'^^' and bi £ ^-i ) are random. So each Ai is positive definite. 
This choice guarantees that the global minimizers of f{x) are contained in some compact set. 





l|A||=3 


l|A||=4 




CPU seconds 


accu 


CPU seconds 


accu 


n 


max 


avr. 


min 


max 


max 


avr. 


min 


max 


20 


0.85 


0.62 


0.54 


4.1e-9 


1.46 


1.15 


0.91 


2.4e-9 


40 


1.22 


1.07 


0.91 


1.9C-9 


2.86 


2.49 


2.25 


2.9e-9 


60 


1.80 


1.55 


1.45 


2.9e-9 


4.43 


4.17 


3.91 


3.1e-9 


80 


2.30 


2.18 


2.02 


2.3e-9 


6.26 


5.94 


5.24 


3.7C-9 


100 


3.02 


2.70 


2.33 


2.8e-9 


7.85 


7.41 


7.01 


5.0e-9 



Table 3: Computational results for quartic polynomials with different sizes 

First, let 2d = 4 and n be 20,40,60,80,100. For each ||A|| (3 or 4) and n, we generate 100 
random polynomials in the way described above. For each one, we solve the sparse SOS relaxation 
p.ip - p.3[) by using softwares SeDuMi and YALMIP, and get the lower bound and optimal 
dual solution y. For all these randomly generated polynomials, the moment matrices A^^'(y) 
have numerical rank one. So we can easily extract the minimizer x. The maximum, average and 
minimum of consumed CPU time are in Table [3] For these random polynomials, we just record 
the maximum error of the extracted minimizers. From Table |3] for A = 4, we can find the global 
minimizer of a quartic sparse polynomial of 100 variables with error O(10~^) within about 8 CPU 
seconds. 





IIAII-3 


l|A|I=4 




CPU seconds 


err 


CPU seconds 


err 


2d 


max 


avr 


min 


max 


max 


avr 


min 


max 


4 


1.01 


0.87 


0.77 


2.5C-9 


2.33 


1.93 


1.65 


2.4c-9 


6 


3.22 


2.96 


2.67 


1.8e-9 


17.16 


14.92 


11.71 


2.2e-9 


8 


13.07 


11.44 


10.13 


1.7e-8 


136.67 


119.90 


107.28 


9.4e-8 



Table 4: Computational results for polynomials of size n = 30 with different degrees 

Second, let n = 30 and 2d be 4,6,8. For each ||A|| (3 or 4) and 2d, we generate 100 random 
polynomials in the way described in the above. For each one, solve the sparse SOS relaxation 
p.ip - (|3.3|) by using softwares SeDuMi and YALMIP, and get the lower bounds and optimal 
dual solution y. Similarly, all moment matrices A^^'(y) have rank one, and the minimizer x can 
be extracted easily. The maximum, average and minimum of the consumed CPU time, and the 
maximum error of extracted minimizers are in Table [l] For |[A|| — 4, the global minimizer of 
such generated polynomials of degree 8 and 30 variables can be found with error O(10~*) within 
about 120 seconds. 

We remark that the sparsity technique in [34] is too expensive to be implementable for min- 
imizing these random polynomials generated in the way as above because of computer memory 



15 



limitations. For these random polynomials, the sparsity technique using chordal extension is al- 
most as expensive as the general dense SOS relaxation. This is because the chordal extensions 
of csp graphs of these polynomials are usually much more dense than the original csp graphs. 
However, as we have seen in the above, the SOS relaxation (|3.1|) - (|3.3|l is very suitable for these 
polynomials. 



4.2.2 Random sparse polynomials with given chordal extension 

In this subsubsection, we generate random sparse polynomials in a similar way as in [3j, and 
compare the performance of our sparse SOS relaxation (|3.ip - p.3|) with the one in [Sj using 
chordal extension. Generate a chordal graph randomly such that the size of every maximal clique 
is at most 6. Let {Ci, • • ■ , Cm} be the set of maximal cliques. If we choose Ai = d, then the 
sparse SOS relaxation (|3.1|l - (|3.3|) is the same as the one using chordal extension. Therefore, to 
make a reasonable comparison, for each d, we choose a random subset A; C d. Choose each 
small polynomial fi to have the form 

Here Ni = At = n/jv, + BB^, B e K^-x^* and bi e R('^'i-?~') are random. The 

global minimizers of f{x) = "^Zi fii^Ai) generated as above always exist and are contained in 
some compact set. 





relaxation ([3T|l-((3?3l) 


relaxation using chordal extension 




CPU seconds 


accu 


CPU seconds 


accu 


n 


max 


avr. 


min 


max 


max 


avr. 


min 


max 


20 


1.75 


1.21 


0.96 


6.8e-9 


2.15 


1.78 


1.43 


5.5e-9 


40 


3.07 


2.69 


2.24 


7.5e-9 


4.08 


3.51 


3.12 


4.9e-9 


60 


4.99 


4.54 


3.82 


6.7e-9 


7.88 


6.93 


5.65 


6.4e-9 


80 


6.59 


5.87 


5.23 


6.3e-9 


10.84 


9.57 


8.59 


5.7e-9 


100 


9.34 


7.64 


7.11 


7.2e-9 


13.45 


12.76 


11.74 


4.3e-9 



Table 5: Comparison with chordal extension on quartic polynomials 



For polynomials randomly generated as above, the technique in ^34^ using chordal extension 
is a good choice, because there exists one sparse chordal extension of the csp graph. Now we 
compare the computational results for these two methods. 





relaxation (l3T|)-([3?3l) 


relaxation using chordal extension 




CPU seconds 


accu 


CPU seconds 


accu 


2d 


max 


avr. 


min 


max 


max 


avr. 


min 


max 


4 


2.87 


1.98 


1.35 


7.2e-9 


3.06 


2.21 


1.69 


4.3e-9 


6 


22.61 


16.78 


10.53 


6.9e-9 


32.15 


23.91 


13.51 


5.1e-9 


8 


193.45 


131.17 


98.75 


6.7e-9 


253.79 


186.84 


112.37 


5.8e-9 



Table 6: Comparison with chordal extension on polynomials with 30 variables 



First, let 2d — 4 and n be 20, 40, 60, 80, 100. For each n, generate 50 random polynomials 
as above. For each of these random polynomials, solve the relaxation (I3.1|l - p.3p . find a chordal 
extension of the csp graph of f{x) and then apply the sparse relaxation in ^34 . Both relaxations 
are solved by softwares SeDuMt and YALMIP. Then we extract minimizers x from moment 
matrices. The computational results are in Table [S] For these solved problems, we just record 
the maximum error of the relaxation. Second, let n = 30 and 2d be 4, 6, 8. For each 2d, generate 
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50 random polynomials as above. For each one, solve the problem by the relaxation (|3.ip - (|3.8p 
and the one in [34) using chordal extension. They are solved by softwares SeDuMt and YALMIP. 
The computational results are in Table [S] 

From Tables [5] and [S] we observe that for polynomials randomly generated as above the 
sparse SOS relaxation (|3.ip - (l3.3p is slightly more computationally efficient than the one using 
chordal extension. As we can see, for these random polynomials, there is not much difference 
between the qualities of these two kinds of sparse SOS relaxations. The distinction between their 
qualities depends on specific problems. Of course, theoretically the sparse relaxation using chordal 
extension in [34j is at least as tight as the relaxation (|3.ip - (|3.3l) . 



4.2.3 Random dense polynomials 

In this subsubsection, we test the performance of our sparse SOS relaxation on minimizing general 
dense polynomials. We observe that every polynomial f{x) is a summation of monomials whose 
number of variables is at most the degree deg(/). So the sparse SOS relaxation (|3.ip - (|3.3p is 
attractive when the degree 2d is small like 4. We generate the random dense polynomials as 
follows 

f{x) = md(x)^ • A ■ md{x) + b'^m2d-iix). 

Here N = ("+'*), A = uIn +BB'^ , B £ R'^ is a. random matrix and h £ r("^^^''"') is a random 
vector. So the global minimizers of f[x) generated this way are contained in some compact set. 
Note that f{x) is also a summation of small polynomials. Let Ai be the subsets of [n] with 
cardinality 2d. Then we can write 

i = l 

for some small polynomials fi{xAi)- 



n 


16 


17 


18 


19 


20 


21 


22 


23 


max time 


335.29 


569.74 


901.32 


1505.45 


2249.19 


3257.86 


4734.25 


7060.72 


avr. time 


241.48 


455.32 


751.69 


1245.22 


2070.70 


2989.45 


4497.84 


6419.53 


min time 


205.60 


397.11 


688.58 


1052.70 


1893.02 


2676.62 


4197.95 


5874.28 


accuracy 


7.3e-9 


6.7e-9 


7.4e-9 


6.9e-9 


8.1e-9 


6.5e-9 


7.9e-9 


8.5C-9 



Table 7: Computational results for dense quartic polynomials 



Since ||A|| = 2d, which should not be big for the effectiveness of the sparse relaxation (|3.1|l - 
([T3ll . we test for the case that 2d = 4. Let n be 16, 17, 18, 19, 20, 21, 22, 23. For each pair (n, d) 
of these values, generate 50 random examples as above. For each random polynomial, solve the 
sparse relaxation (|3.ip - (|3.3p by using softwares SeDuMi and YALMIP. The consumed CPU time 
and the accuracy of relaxation are in Table [T] We can see that the obtained solutions are very 
good within reasonably acceptable time. When n > 24, the sparse relaxation (I3.ip - (l3.3p is then 
also too expensive to be implementable due to computer memory restrictions. 

For these randomly generated dense polynomials, the general dense SOS relaxation and sparse 
SOS relaxations like in [34] are not implementable for n > 16, due to either computer memory 
shortage or the consumed time being more than 10 hours. However, when 2d is small like 4, the 
sparse SOS relaxation p.ip - (|3.3p can solve bigger dense polynomial optimization problems which 
can not be solvable by other methods. 

5 Applications 

Minimizing a summation of small polynomials arises in various applications. Many big poly- 
nomials in applications often come in this form. In such situations, the sparse SOS relaxation 



17 



p.ip - p.3|) or p.lip - (|3.13p is very useful. In this section, we show some applications in solving 
sparse polynomial systems and sensor network localization. 



5.1. Solving sparse polynomial system 

Suppose we are trying to solve the sparse polynomial system 

gii^Ai) = 0, g2{xA2) = 0, ■ ■ ■ , 5m(xA„) = 0. 

In some applications, these equations are redundant or even inconsistent. When the polynomial 
system does not admit a solution, we want to seek a least squares solution, which is often useful 
in applications. 

This problem can be formulated as finding the global minimizer of the sparse polynomial 

m 
i = l 

The polynomial system has a real zero if and only if /* = 0. When /* = 0, the global minimizers 
are precisely the real zeros of the polynomial system. When /* > 0, the global minimizers are 
the least squares solutions. 

One important sparse polynomial system of the above form is from computing the numerical 
solutions of nonlinear differential equations. Consider the two-point boundary value problem 
(BVP) 

F{t,x,x ,x") = Q, x{a) — a, x{b) = P 

where F{t, x, x' , x") is polynomial function in t, x, x' , x" . To find the numerical solution, the 
central difference approximation with a uniform mesh is often used to discretize the derivatives. 
Let A'' be a positive integer and set h — . Then we get polynomial difference equations 

Fl^t,,x,, , J =0,fc = l,... ,N 

where xq ~ a, xn+i = (3 and tk = a + hk. Every polynomial on the left involves 2 or 3 variables 
Xk-i, Xk, Xk+i- So this is a sparse polynomial system. There are several methods for solving 
this kind of polynomial system, like Newton's method and homotopy methods. Newton's method 
is very fast, but often require an accurate initial guess. Homotopy methods do not require a 
"satisfactory" guess and work well for small N, but are expensive to implement for large A^. We 
refer to [I] and the references therein for work in this area. When A*' is large, this polynomial 
system is large but sparse. We solve this system by applying the sparse SOS relaxation (|3.11|l - 
p.l6p for big A'' (up to 100 or even bigger). 

Example 5.1 (X,). Consider a basic BVP 

a;"-2a;^ = 0, x(0) = i, x(l) = i. 

The exact solution to this problem is x{t) — -j-^. Now we discretize the differential equation with 
mesh size h = j^pi i then get the difference equation 

^ ~ 2x1 + X2 - 2h^xl = 0, 
Xk-i - 2xk + Xk+i - 2h'xl = 0, fc = 2, • ■ ■ , Af - 1 
xn~i - 2xN + ^ ~ 2h'^x% = 0. 

This is a polynomial system about xi,X2,--- ,xn- We can solve this polynomial system as a 
nonlinear least squares problem by applying sparse SOS relaxation (|3.11|I - H3.16|) . The computa- 
tional results are in Table |8l The equation error is defined to be the infinity norm of the residuals 
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IIt, - rlt, )\\ Ih'-' 
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9 ^9Q1 p niT^ 


n ^9 


10 






2 3fi8np-nn3 

. ij U L/ C \J\J\J 


0.77 


20 


5.2879e-07 


1.5041e-05 


6.6331e-003 


1.18 


30 


2.6194e-07 


1.9413e-05 


1.8656e-002 


2.09 


40 


3.0304e-07 


4.3344e-05 


7.2861e-002 


3.99 


50 


6.5375e-07 


1.5124e-04 


3.9338e-001 


6.82 


60 


1.5271e-06 


4.8695e-04 


1.8119e+00 


7.77 


70 


1.2555e-06 


5.2428e-04 


2.6429e+00 


9.16 


80 


9.7315e-07 


6.1330e-04 


4.0239e+00 


9.78 


90 


2.7519e-06 


1.9311e-03 


1.5991e+01 


10.81 


100 


1.8628e-06 


8.1425e-04 


8.3062e+00 


8.79 



Table 8: The performance of (|3.ip - (|3.8p solving the equations in Example 1 5. II 



of the left hand side of the polynomial system, which measures the quality of how the polyno- 
mial systems are solved. The obtained solutions have equation error from 0(10^'^) to 0(10"'^). 
If we want to make them more accurate, they can be used as the initial guesses in Newton's 
methods for refining. The accuracy of the discretization is defined to be the difference between 
computed solution and true solution x{tk) ~ JTt^ where tk = -jyTT' Since the discretization 
has error 0{h^), we expect that \\xk — x{tk)\\cxj/h^ is a constant. When A'' < 40, we can see that 
\\xk — x{tk)\\oo / hi^ is almost constant. When A'^ > 50, SeDuMi experienced numerical troubles, and 
the returned solutions are not as accurate as for the smaller Ns. This explains why \\xk —x{tk)\\oo 
and ll^fc — x{tk)\\cx,/h^ becomes bigger for A'' > 50. Time records the CPU seconds consumed by 
the SDP solver SeDuMi. For A'' — 100, the computation takes less CPU time than for A'^ = 80 or 
A'^ = 90. This is because the numerical troubles make SeDuMi terminate earlier. 

Example 5.2. Consider another BVP 

^" + ^ix + t)^ = 0, a;(0) = 0, x(l) = 0. 

Now we discretize the differential equation with mesh size h = jjqrj, then get the difference 
equation 

2xi - X2 + ^h\xi + tif =0 

2xi - Xi-i - x,+i + ^h^{x, + Uf = 0, i = 2, ■ • • , A^ - 1 

2xn — xn-1 + ^h^(xN + tn)^ = 

This is a polynomial system about xi,X2, - ■ ■ ,xn- We can solve this polynomial system as a 
nonlinear least squares problem by applying sparse SOS relaxation (|3.1Hl - p.l6p . When A'' = 30, 
we get the following real solution within about 2.5 CPU seconds (only the first four digits are 
shown) : 

(-0.0159, -0.0312, -0.0459, -0.0600, -0.0735, -0.0864, -0.0985, -0.1099, -0.1205, -0.1302, 
-0.1391, -0.1470, -0.1540, -0.1599, -0.1646, -0.1682, -0.1705, -0.1715, -0.1710, -0.1689, 
-0.1651, -0.1596, -0.1521, -0.1425, -0.1307, -0.1164, -0.0995, -0.0796, -0.0567, -0.0302). 



5.2. Sensor Network Localization 

The sensor network location problem is basically described as follows: find a sequence of 
unknown vectors xi,X2, - • ' y^n € M.^{k — 1, 2, ■ ■ ■ ) (they are called sensors) such that distances 
between these sensors and some other known vectors ai,--- ,am (they are called anchors) are 
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equal to some given numbers. Now each Xi itself is a fe- dimensional vector. To be more specific, 
let A = {{i,j) e [n] x[n] : i < j, \\xi-Xj\\2 = dij}, and B = {{i, k) £ [n] x [m] : ||a;i-0(,||2 = e,fc}, 
where dij , eik are given distances. Then the sensor network localization problem is to find vectors 
xi,X2, - ■ ■ ,Xn such that ||a;i — a;j ||2 = dij for all € A and \\xi — ak\\2 — etk for all (i, k) £ B. 
Notice that A and B only give some partial pairs of distances. A does not contain all the pairs 
such that i < j, and neither does B. 

Sensor network localization is also known as the graph realization problem or the distance 
geometry problem. Given a graph G = {V, E) along with a real number associated with each edge, 
graph realization is to assign each vertex a coordinate so that the Euclidean distance between any 
two adjacent vertices is equal to the real number associated with that edge. 

The locations of sensors can be determined from the polynomial system 

\\xi - Xj\\l = d%.y{i,j) G A, 
\\xi - ah\\l = eiky{i,k) G B. 

Usually solving this polynomial system directly is very expensive. Here we solve this polynomial 
system as a nonlinear least squares problem. Minimize the quartic polynomial function 

f{x)~ X] {\\x^ ~ XjWl - d^jf + ^ - akWl - eft:)'^ . (5.1) 

{i.j)eA (i.k)eB 

where x — [xi, ■ ■ ■ , Xn]. x* is a solution to sensor network localization problem if and only if 
X* is a global minimizer of f{x) such that f{x*) = 0. When x* is a global minimizer such that 
fix") > 0, the distances di^j and Bit are not consistent, and x* is a solution in the least squares 
sense. This polynomial f{x) is of the form ((TTTJ, and our sparse SOS relaxation (|3.1|l - (|3.3p can 
be applied to solve the problem. 

We randomly generate test problems which are similar to those given in [6]. First, we randomly 
generate n = 500 sensor locations xl, ■ ■ ■ ,a;* from the unit square [—0.5, 0.5] x [—0.5, 0.5]. The 
anchors {ai , 02, aa, a4} (m = 4) are chosen to be four fixed points (±0.45, ±0.45). Choose edge 
set A such that for every sensor x* there are at most 10 sensors x* {j > i) with G A and 

[|a;*— a;j||2 < 0.3. For every (i,j) G ^, compute the distance [[a::*— a::*||2 = di-j. Choose edge B such 
that every anchor is connected to all the sensors within distance 0.3. For every (i, k) G B, compute 
the distance \\x* — afc|[2 = Cife. Then we apply sparse SOS relaxation p.lfl - p.SI) to minimize 
polynomial function (|5.ip . The accuracy of computed sensor locations x\, - ■ ■ , i„ will be measured 

by the Root Mean Square Distance (RMSD) which is defined as RMSD = (i YJ^^i W^i - II2) ^ • 
We use SeDuMi to solve the sparse SOS relaxation (I3.1|I - (I3.8|I on a Linux machine with 1.46 
GHz CPU and 0.98GB memory. The problem can be solved within about 18 CPU minutes with 
accuracy 0(10''^). 

We refer to [24j for more details about sparse SOS methods for sensor network localization. 

6 Conclusions and discussions 

This paper proposes sparse SOS relaxations for minimizing polynomial functions that are sum- 
mations of small polynomials. We discuss various properties of these relaxations and the com- 
putational issues. We also present applications of this sparsity technique in solving polynomial 
equations derived from nonlinear differential equations and sensor network localization. As a spe- 
cial case, this sparsity technique provides a heuristic approach to solve bigger dense polynomial 
optimization problems. 

In order to exploit the sparsity, the polynomial and its SOS representation must be sparse. 
In many applications, the polynomials are often given with sparsity pattern (|l.ip . and then the 
sparsity technique proposed in this paper is very suitable. If the sparsity pattern is not given, 
one important future work is how to represent the polynomial in a sparse pattern such that the 
technique proposed in this paper is most efficient. Of course, one simple choice is to consider each 
monomial as a small polynomial. 

The idea of this sparse SOS relaxation can be applied in a similar way to solve constrained 
polynomial optimization problems, provided the objective and constraint polynomials are also 
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sums of small polynomials. See Kim et al. [13] and Lasserre [18] for related work. To get the 
global minimum, high order relaxations are usually necessary. Lasserre [T^ proved the conver- 
gence under the running intersection property. However, unlike the general dense SOS relaxation 
for minimizing polynomials over compact sets, the convergence might fail when the running inter- 
section property does not hold. As a counterexample, consider the Mtnimum Cover Set Problem. 
Let G = (V, E) be a graph with vertex set V = [3] and edge set E = {{1,2), (1, 3), (2, 3)}. To find 
the minimum cover set is equivalent to solving 

min /i (a;Ai ) + f2 (xa^ ) + fs {XA3 ) 
,2 2 2 

S.t. = Xi, X2 = X2, X-i, = X-i, 

xi + X2> 1, a;i -I- 2:3 > 1, a::2 + 3:3 > 1 

where Ai = {1,2}, A2 = {1,3}, A3 = {2,3} and /i(xai) = \{xi + X2), ^(xaJ = \{xi + 
faix/^^) — \{x2 + S3). The running intersection property now fails. However, we can prove 
that the global minimum /* = 2 and the lower bounds given by sparse SOS relaxations are at 
most |. The sparse SOS relaxations do not converge for this example. 

Another important future work is to apply the sparse SOS relaxations in solving big real sparse 
polynomial systems arising from nonlinear differential equations. 

Acknowledge: The authors wish to thank Bernd Sturmfels and the referees for helpful sugges- 
tions to improve this paper. 
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